Human coronavirus OC43 (HCoV-OC43) is a causative agent of the common cold. The nucleocapsid (N)
protein, a major structural protein of CoVs, binds to the viral RNA genome to form the virion core,
resulting in the formation of the ribonucleoprotein (RNP) complex. We have solved the crystal
structure of the N-terminal domain of the HCoV-OC43 N protein (N-NTD; residues 58 to 195) to a
resolution of 2.0 Å. The HCoV-OC43 N-NTD is a single-domain protein composed of a five-stranded
β-sheet core and a long extended loop, similar to that observed in the structures of N-NTDs from
other coronaviruses. The positively charged loop of the HCoV-OC43 N-NTD contains a structurally
well-conserved positively charged residue, R106. To assess the role of R106 in RNA binding, we
undertook a series of site-directed mutagenesis experiments and docking simulations to characterize
the interaction between R106 and RNA. The results show that R106 plays an important role in the
interaction between the N protein and RNA. In addition, we showed that, in cells transfected with
plasmids encoding the mutant (R106A) N protein and infected with virus, the level of the matrix
protein gene was 7-fold lower than in cells transfected with the wild-type N protein. This finding
suggests that R106, by enhancing binding of the N protein to viral RNA, plays a critical role in
viral replication. The results also indicate that the strength of N protein/RNA interactions is
critical for HCoV-OC43 replication.

Crown Copyright © 2013 Published by Elsevier B.V. All rights reserved.


1. Introduction


The OC43 strain of the betacoronavirus family (HCoV-OC43), which was first identified in the 1960s,
is responsible for ~20% of all “common colds” in humans [1,2]. Although HCoV-OC43 infections are
generally mild, more severe upper and lower respiratory tract infections, such as bronchiolitis and
pneumonia, have been documented, particularly in infants, elderly individuals, and
immunocompromised patients [1,3,4]. Moreover, there have been reports that clusters of HCoV-OC43
infections cause pneumonia in otherwise healthy adults [2,5]. Several studies have also reported
that both neurotropism and neuroinvasion of HCoV, particularly the OC43 strain, are associated with
multiple sclerosis [6].

CoV particles have an irregular shape defined by an outer envelope with a distinctive club-like
shape. Peplomers on the outer envelope give the virus a crown-like (coronal) appearance [7]. The
viral genomes of coronaviruses consist of approximately 30 kb of positive sense, single-stranded
RNA. These genomes contain several genes encoding structural and nonstructural proteins that are
required for the production of progeny virions [1]. The virion envelope that surrounds the
nucleocapsid contains the S (spike), M (matrix), and E (envelope) structural proteins. A third
glycoprotein, HE (hemagglutinin-esterase), is present in most betacoronaviruses [8,9]. The virion
contains a helical nucleocapsid, which consists of the N protein bound to viral RNA. The N protein
is the major structural protein of CoVs [10-12]. The formation of the RNP is important for
maintaining the RNA in an ordered conformation that is suitable for viral genome replication and
transcription [13-16]. Previous studies have shown that the CoV N protein is involved in the
regulation of cellular processes, such as gene transcription, actin reorganization, host cell cycle
progression, and apoptosis [17-20]. It has also been shown to act as an RNA chaperone [21].
Moreover, the N protein is an important diagnostic marker and is an antigen for the host antibody
and T cell immune responses [16,22-24].

Previous studies have revealed that the N- and C-terminal do- 
mains of the CoV N proteins, including those of the SARS-CoV, murine 
hepatitis virus (MHV), and avian infectious bronchitis virus (IBV), 
are responsible for RNA binding and oligomerization, respectively 
[25-31]. The central region of the N protein has also been shown to 
contain an RNA-binding region and primary phosphorylation sites 
[32-34]. Phosphorylation of the N protein has been shown to play 
an important role in virus biology [35,36]. To clarify the molecular 
mechanism of RNP formation in CoVs, the structures of truncated 
fragments of the N protein, including the N-terminal and C-terminal 
domains, were investigated [27,37-39]. Despite the conservation of 
some motifs, the CoV N proteins from various different strains often 
exhibit different properties, due primarily to their low sequence 
homology [25]. 

The N protein of HCoV-OC43, which has a molecular weight of 
~50 kDa, is highly basic (pI, 10.0) and exhibits strong hydrophilicity 
[40]. It also shows only 26-30% amino acid identity to N proteins 
from other CoV strains [25]. In this study, because of its importance 
for RNA binding, we chose to characterize the structure of the 
N-terminus of the HCoV-OC43 N protein using X-ray crystallography. 
Using the crystal structure and the surface charge distribution of the 
N-terminus, we were able to further investigate the interactions be- 
tween the HCoV-OC43 N protein and the RNA molecule. Furthermore, 
we identified an important role of R106 in the binding of the 
HCoV-OC43 N protein to RNA using SPR analysis and site-directed mu- 
tagenesis. Finally, we present a structural model of the HCoV-OC43 N 
protein in complex with RNA, which clearly demonstrates the critical 
role of R106. 


2. Materials and methods 


All drugs and reagents were purchased from Sigma Chemical Co. 
(St. Louis, MO) unless otherwise specified. All oligoribonucleotides 
and oligodeoxyribonucleotides were synthesized using an automated 
DNA synthesizer and purified by gel electrophoresis. Biotin-linked 
oligomers were synthesized by incorporating the biotin synthon at 
the 5’-end; the oligomers were then immobilized to streptavidin- 
coated biosensor chips for the SPR experiments. 


2.1. Expression and purification of the full-length and truncated N proteins 


The templates for the HCoV-OC43 N protein were kindly provided by the Institute of Biological
Chemistry, Academia Sinica (Taipei, Taiwan). To generate both the full-length version and the
N-terminal domain of the recombinant N protein, the N protein gene was amplified by polymerase
chain reaction (PCR) from a plasmid (pGENT) using various primers. The PCR products were digested
with NdeI and XhoI, and the DNA fragments were cloned into pET28a (Novagen) using T4 ligase (NEB).
Bacteria transformed with the resultant plasmid were grown in culture. Expression of the
recombinant N proteins was induced by supplementing the culture medium with 1 mM IPTG, followed by
incubation at 10 °C for 24 h. After harvesting by centrifugation (8,000 × g, 10 min, 4 °C), the
bacterial pellets were lysed in lysis buffer (50 mM Tris-buffered solution at pH 7.3, 150 mM NaCl,
0.1% CHAPS, and 15 mM imidazole). The soluble proteins were obtained from the supernatant
following centrifugation (13,000 rpm, 40 min, 4 °C). The methods used for protein purification
have been described previously [41]. Full-length and truncated N proteins carrying a His6-tag at
the N-terminus were purified using a Ni-NTA column (Novagen) with an elution gradient ranging from
15 to 300 mM imidazole. Fractions were collected and dialyzed against low-salt buffer. The protein
concentrations of the resulting samples were determined using the Bradford method with Bio-Rad
protein assay reagents.


2.2. Crystallization 


The initial crystallization experiments were set up using the Qiagen crystal screens JCSG+ Suite
and PACT Suite [42] with the sitting-drop vapor-diffusion method, in accordance with our previously
described protocol [43]. Each of the crystallization solutions (2 µl) obtained from the screen was
mixed with 1.5 µl of purified protein solution (8 mg/ml) and 0.5 µl of 40% hexanediol at room
temperature (~298 K) against 400 µl of solution in each well of a Cryschem plate. The conditions
were refined through seven cycles, and the crystals were grown in a solution containing 0.25 M SPG
buffer (pH 6.0) and 25% PEG1500 and then equilibrated at 293 K against 400 µl of the precipitation
solution. The SPG buffer was prepared by mixing succinic acid (Sigma), sodium dihydrogen phosphate
(Merck), and glycine (Merck) in a 2:7:7 molar ratio and then adjusting with sodium hydroxide to
obtain a pH of 6.0 [44]. The crystals appeared within two weeks, and the largest crystal grew to
dimensions of approximately 200 × 100 × 100 µm. The crystals were then soaked in reservoir solution
containing 30% (v/v) glycerol as the cryoprotectant prior to being flash-cooled in a nitrogen-gas
stream at 100 K. High-resolution X-ray data were collected using a synchrotron radiation source.
The complete dataset was collected at beamline BL13B1 at the NSRRC using an ADSC Q315r detector.
The crystallographic data integration and reduction were performed using the software package
HKL2000 [45]. The crystallographic statistics are listed in Table 1. The Matthews coefficient of
2.06 Å³/Da, which was calculated using Matthews (Collaborative Computational Project, 1994) [46],
suggested that the asymmetric unit is likely to contain one molecule. The solvent content was
40.26%. The N-terminal domain of the N protein from SARS-CoV (PDB ID: 2ofz) was chosen as the
initial search model due to its low E-value of 1 x 10~ 7°. The first molecular replacement trial
was performed using the PERON automated interface at the Protein Tectonics Platform (PTP), RIKEN
SPring-8 Center, Japan [47]. The best results were obtained using the MOLREP program [48]. A single
and unambiguous solution for the rotation and translation functions was found with the reflections
in the resolution range of 3.0-30 Å, a final correlation coefficient of 0.79 and an R factor of
0.44. The structure was refined further using the Crystallography & NMR System (CNS) [49] and
deposited in the Protein Data Bank (PDB ID: 4j3k).
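
The relationship between the Matthews coefficient and the solvent content quoted above can be sketched as follows. This is a minimal illustration rather than the CCP4 program used in the study: the function names are hypothetical, and the constant 1.23 Å³/Da is the conventional value for a protein of typical density, which is an assumption not stated in the text.

```python
import math

# Conventional constant (Å^3/Da) for a protein of typical density; an assumption,
# not a value taken from the paper.
PROTEIN_FACTOR = 1.23


def hexagonal_cell_volume(a: float, c: float) -> float:
    """Volume (Å^3) of a cell with a = b, alpha = beta = 90°, gamma = 120°."""
    return a * a * c * math.sin(math.radians(120.0))


def matthews_coefficient(cell_volume: float, n_asu: int, n_mol: int, mw_da: float) -> float:
    """V_M (Å^3/Da): cell volume divided by the total protein mass in the cell.

    n_asu is the number of asymmetric units per cell, n_mol the number of
    molecules per asymmetric unit, and mw_da the molecular weight in daltons.
    """
    return cell_volume / (n_asu * n_mol * mw_da)


def solvent_fraction(v_m: float) -> float:
    """Estimated solvent fraction of the crystal for a given V_M."""
    return 1.0 - PROTEIN_FACTOR / v_m


if __name__ == "__main__":
    # V_M = 2.06 Å^3/Da is the value reported in the text.
    print(f"{100 * solvent_fraction(2.06):.1f}% solvent")  # 40.3% solvent
```

With the reported V_M of 2.06 Å³/Da, this estimate gives roughly 40% solvent, in line with the 40.26% stated above.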


2.3. Site-directed mutagenesis 


The single mutants (R106A, R106K, R106Q, R106E, R107A, K110A, and R117A) were constructed using a
QuikChange™ kit (Stratagene) with a plasmid containing an open reading frame that encodes the
full-length HCoV-OC43 N protein as the template for mutagenesis. The PCR used Pfu DNA polymerase,
and each cycle involved heating the sample at 95 °C for 30 s, 55 °C for 1 min, and 68 °C for
2 min/kb of plasmid length; this sequence was repeated for a total of 16 cycles. The templates
were digested with DpnI and transformed into E. coli XL-1 cells. All mutations were confirmed by
automated sequencing in both directions.


Table 1
Crystallographic and refinement data for the HCoV-OC43 N-NTD.

Data collection
  Wavelength (Å)                  1.0
  Space group                     P65
  Unit cell parameters (Å, °)     a = b = 81.57, c = 42.55
                                  α = β = 90, γ = 120
  Resolution limit (Å)            30-2.0 (2.07-2.00)
  Completeness (%)                99.8 (100)
  Unique reflections              11,131 (1086)
  Redundancy                      8.2 (8.5)
  Rmerge                          0.037 (0.166)
  <I/σ(I)>                        49.4 (13.77)
Refinement statistics
  Rcryst                          0.200
  Rfree                           0.214
  RMSD bond lengths (Å)           0.013
  RMSD bond angles (°)            1.978
  Most favored region (%)         97.0
  Generally allowed region (%)    1.5
  Others (%)                      1.5
  Average B-factor                43.8


2.4. Surface plasmon resonance (SPR) binding experiments 


The affinity, association, and dissociation of N proteins and RNA were measured in a BIAcore 3000A
SPR instrument (Pharmacia, Uppsala, Sweden) equipped with a Sensor Chip SA5 (Pharmacia, Uppsala,
Sweden). The apparatus measured binding by monitoring the change in the refractive index at the
sensor chip surface. These changes, which were recorded in resonance units (RU), are generally
assumed to be proportional to the mass of the molecules bound to the chip. The surface was first
washed three times by injecting 10 µl of a solution of 100 mM NaCl and 50 mM NaOH. To control the
amount of RNA (or DNA) bound to the SA chip surface, the biotinylated oligomer was immobilized
manually onto the surface of a streptavidin chip until a signal of 200 RU was achieved in the first
flow cell. The chip surface was then washed with 10 µl of 10 mM HCl to eliminate non-specific
binding. The second flow cell was unmodified and served as a control. The appropriate N proteins
were dissolved in 50 mM Tris (pH 7.3) with 150 mM NaCl and 0.1% CHAPS and passed over the chip
surface for 140 s at a flow rate of 30 µl/min to reach equilibrium. A blank buffer solution was
then passed over the chip to initiate the dissociation reaction; this step was continued for an
additional 600 s until the reaction was complete. After 600 s, the surface was regenerated by
washing with 10 µl of 0.1% SDS for each single-stranded RNA. Before fitting to the 1:1 Langmuir
model, the binding data were corrected by subtracting the control to account for simple refractive
index differences. The sensorgrams of the interactions between RNA and the proteins were analyzed
using BIAevaluation software (version 3).
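
The 1:1 Langmuir model used for the fitting can be sketched as follows. This is a minimal illustration of the model equations, not the BIAevaluation fitting routine: the function names and the maximal response Rmax are hypothetical, and the rate constants in the usage example are the WT values from Table 2, assuming the kd column there is scaled by 10⁻⁴ s⁻¹.

```python
import math


def observed_rate(ka: float, kd: float, conc: float) -> float:
    """Pseudo-first-order rate (s^-1) during analyte injection: ka*C + kd."""
    return ka * conc + kd


def equilibrium_response(rmax: float, ka: float, kd: float, conc: float) -> float:
    """Plateau response (RU) at analyte concentration conc (M)."""
    return rmax * ka * conc / (ka * conc + kd)


def association(t: float, rmax: float, ka: float, kd: float, conc: float) -> float:
    """Response (RU) t seconds after the start of the injection."""
    plateau = equilibrium_response(rmax, ka, kd, conc)
    return plateau * (1.0 - math.exp(-observed_rate(ka, kd, conc) * t))


def dissociation(t: float, r0: float, kd: float) -> float:
    """Response (RU) t seconds after switching to blank buffer, starting from r0."""
    return r0 * math.exp(-kd * t)


if __name__ == "__main__":
    KA = 5.01e4        # M^-1 s^-1, WT value from Table 2
    KD_RATE = 5.84e-4  # s^-1, WT value from Table 2 (assumed 1e-4 scaling)
    CONC = 0.1e-6      # M, protein concentration quoted in the Fig. 4 caption
    RMAX = 100.0       # RU, hypothetical surface capacity

    r_end = association(140.0, RMAX, KA, KD_RATE, CONC)
    for t in (0.0, 70.0, 140.0):
        print(f"association  t={t:5.0f} s  R={association(t, RMAX, KA, KD_RATE, CONC):6.2f} RU")
    for t in (0.0, 300.0, 600.0):
        print(f"dissociation t={t:5.0f} s  R={dissociation(t, r_end, KD_RATE):6.2f} RU")
```

In an actual fit, ka, kd, and Rmax would be adjusted to minimize the difference between these curves and the reference-subtracted sensorgrams.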


2.5. RNA docking 


The computational docking of the ssRNA target onto the HCoV-OC43 N-NTD was performed using
HADDOCK1.3 (High Ambiguity Driven DOCKing) software [50]. The starting structures for docking were
a model of the single-stranded RNA fragment (5’-UCUAAAC-3’), which was constructed with the DS
module Biopolymer, and the crystal structure of the HCoV-OC43 N-NTD. To monitor the interaction
between R106 and RNA, R106 was defined as an active residue. The other active residues in the
N protein, which included R122, Y124, Y126, and R164, were chosen based on previous studies [51-53]
and their solvent accessibility (>50%), as determined using the MOLMOL program. The passive
residues of the N protein were defined as the solvent-accessible surface neighbors of the active
residues. Default parameters were used for each HADDOCK1.3 run. A total of 1000 docked structures
were generated, and the best 100 were refined using simulated annealing. The model structure with
the lowest energy was selected for further analysis and is shown in the figures.


2.6. Virus infection and real-time RT-PCR 


Real-time RT-PCR was performed as previously described [54]. Briefly, 293T cells were cultured in
DMEM containing 10% FBS (ATLANTA Biologicals), 1% NEAA (Invitrogen) and 10 µM β-mercaptoethanol
(β-ME). A total of 3 × 10⁵ 293T cells were seeded into each well of a 12-well plate one day before
transfection. The cells were transfected with pcDNA3.1/NP using FuGENE 6 (Roche). After 24 h, the
medium was removed and the cells were infected with OC43 virus (MOI = 1) at 33 °C for 2 h in
250 µl DMEM with shaking every 10 min. At day 4 after infection, the medium was removed, the cells
were lysed in 1 ml Trizol (Invitrogen), RNA was extracted following the manufacturer's
instructions, and 2 µg of the RNA was used as the template for cDNA synthesis. cDNA (2 µl) was
added to 23 µl of a PCR cocktail that contained 2× SYBR Green Master Mix (ABI, Foster City, CA) and
0.2 µM of sense and antisense primers (IDT DNA, Coralville, IA). Amplification was performed in an
ABI Prism 7700 Thermocycler (ABI). The specificity of the amplification was confirmed using
dissociation curve analysis. The data were collected and recorded using the ABI Prism 7700 software
and expressed as a function of the threshold cycle (Ct), which is the cycle at which the
fluorescence intensity in a given reaction tube rises above background (calculated as 10 times the
mean standard deviation of the fluorescence in all wells over the baseline cycles). The specific
primer sets used for assaying the expression of the OC43 matrix protein (MP) gene and the
housekeeping gene GAPDH were: OC43 MP, Fwd-ATGTTAGGCCGATAATTGAGGACTAT,
Rev-AATGTAAAGATGGCCGCGTAT; GAPDH, Fwd-CCACTCCTCCACCTTTTGA, Rev-ACCCTGTTGCTGTAGCCA.
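
The text does not state which formula was used to convert Ct values into relative transcript levels. One widely used option is the comparative Ct (2^-ΔΔCt) method, sketched below under that assumption, with GAPDH as the reference gene. The function name and all Ct values are hypothetical and serve only to illustrate the arithmetic.

```python
def relative_expression(
    ct_target_sample: float,
    ct_ref_sample: float,
    ct_target_control: float,
    ct_ref_control: float,
    efficiency: float = 2.0,
) -> float:
    """Fold change of the target gene in a sample relative to a control.

    Each Ct is first normalized to the reference gene (delta Ct); the sample is
    then compared with the control (delta-delta Ct). An efficiency of 2.0
    assumes perfect doubling of product per cycle.
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    return efficiency ** -(delta_sample - delta_control)


if __name__ == "__main__":
    # Hypothetical Ct values: MP and GAPDH in a sample and in a control condition.
    fold = relative_expression(
        ct_target_sample=20.0, ct_ref_sample=18.0,
        ct_target_control=23.0, ct_ref_control=18.0,
    )
    print(f"MP level, sample vs. control: {fold:.1f}-fold")  # 8.0-fold
```

A difference of three cycles after normalization corresponds to an 8-fold difference in template, so a 7-fold difference would correspond to a ΔΔCt of about 2.8 cycles.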


3. Results and discussion 
3.1. The crystal structure of HCoV-OC43 N-NTD 

The structure of the HCoV-OC43 N-NTD was determined by the molecular replacement method using the
crystal structure of the SARS-CoV N-NTD (PDB ID: 2ofz) as the template. The final protein structure
(Fig. 1A) has R-factor and R-free values of 0.200 and 0.214, respectively (Table 1). The core of
the structure of the HCoV-OC43 N-NTD is a five-stranded anti-parallel β-sheet sandwiched between
loops (or short 3₁₀ helices) (Fig. 1A and B). Strands β2 and β3 are connected by a long flexible
loop, composed of amino acid residues 105 to 120, that protrudes from the core. However, because of
the flexibility of the loop, it was relatively difficult to locate its precise position,
particularly for residues 115-117, which are denoted by dotted lines. The MODELLER program was used
to model the loop of the HCoV-OC43 N-NTD. The loop region contains four positively charged
residues (Arg106, Arg107, Lys110 and Arg117). Based on the surface charge distribution (Fig. 1C),
we proposed that the disordered loop of the HCoV-OC43 N-NTD provides a site for binding of the
phosphate backbone of RNA through electrostatic interactions.

Fig. 1. The structure and topology of the HCoV-OC43 N-NTD. (A) A ribbon diagram of the HCoV-OC43
N-NTD depicts the presence of five β strands, two 3₁₀ helices, and several disordered regions.
(B) The topology of the HCoV-OC43 N-NTD shows the relative positions of the secondary structures
of the truncated protein. (C) The surface charge distribution of the HCoV-OC43 N-NTD.

Previous studies have reported that the N-terminus of the SARS-CoV N protein provides a scaffold
for RNA binding [29,37,55,56]. X-ray analysis revealed that the fold of the N-terminal domain of
the N protein is essentially conserved across the various CoVs. It has a right-handed fist-shaped
structure, in which the palm and fingers are rich in basic residues and the flexible loops are
organized around the β-sheet core of the N-terminal domain [27,37]. Spencer and Hiscox found that
the N-terminal region of the IBV N protein facilitates long-range, non-specific interactions
between the N protein and viral RNA, thus leading to the formation of the ribonucleocapsid via a
lure and lock mechanism [29]. There are multiple RNA binding sites on the N protein, and they bind
to RNA cooperatively [57]. Similar to the N-terminal domains of the SARS-CoV and IBV N proteins,
the N-terminal domain of the HCoV-OC43 N protein was able to bind to RNA. In addition to the
N-terminal domain, the highly positively charged portions of the central linker region also show
RNA binding affinity, whereas the C-terminal domain of the HCoV-OC43 N protein does not bind RNA
[58].

Fig. 2. Amino acid sequences of N-NTDs from various coronaviruses. (A) Secondary structural
alignment of the amino acids in N-NTDs from HCoV-OC43, SARS-CoV (PDB ID: 2ofz), IBV (PDB ID: 2gec),
and MHV (PDB ID: 3hd4). (B) Superimposition of the HCoV-OC43 N-NTD (green) with N-NTDs from
SARS-CoV (magenta), IBV (yellow), and MHV (blue). (C) Surface charge distribution of N-NTDs from
HCoV-OC43, SARS-CoV, IBV, and HCoV-229E. The structure of the HCoV-229E N-NTD was modeled in the
MODELLER program using the SARS-CoV N-NTD as a template.


3.2. Analysis and comparison of the structures of HCoV N-NTD 


The secondary structure of the HCoV-OC43 N-NTD was compared to those of N-NTDs from other
coronaviruses, e.g., MHV, SARS-CoV, and IBV, using the Sequence Annotated by Structure (SAS)
website. The differences in the distribution of the secondary structures of these coronaviruses are
not obvious, as shown in Fig. 2A [28,37,52], which implies that N-NTDs are highly conserved among
various coronaviruses. Consistent with this, superimposition of the structures of the HCoV-OC43,
MHV, SARS-CoV, and IBV N-NTDs (Fig. 2B) showed that the structure of the HCoV-OC43 N-NTD is quite
similar to those of the MHV, SARS-CoV, and IBV N-NTDs, in that it contains a loop protruding from
the β-sheet core. However, in the MHV, SARS-CoV, and IBV N-NTDs the protruding segment comprises a
β-hairpin, whereas a flexible loop predominates in the HCoV-OC43 N-NTD. More significantly, as
pointed out by Saikatendu et al., the surface charge distribution, as shown in Fig. 2C, is also
quite different for N-NTDs from different species [37]. One common feature, however, is that they
all possess positively charged amino acids on or near the protruding loop. NMR studies by Huang et
al. have shown that the positively charged loop region is the RNA binding site in the SARS-CoV
N-NTD [55]. The MHV, IBV, and HCoV-OC43 N-NTDs also possess similar positively charged amino acids
in this area; thus, it is likely that this area also binds to the negatively charged phosphate
backbone of RNA via electrostatic interactions [28,37,52,55]. Nonetheless, the patterns of surface
charge distribution of HCoV-OC43, SARS-CoV, IBV, and HCoV-229E differ significantly, suggesting
that the RNP packaging may be quite different, although they exhibit short stretches of locally
conserved sequences with similar electrostatic sequence profiles (Fig. 2C). In fact, the RNP core
is self-packaged differently among various morphologically distinct nidoviruses [37].

Fig. 3. (A) Amino acid sequence alignment, performed using T-coffee, of N-NTDs from HCoV-OC43
(NP_937954), SARS-CoV (ABI96968), HCoV-229E (AAG48597), IBV (AAB24054), and MHV (ACO72897), all of
which were retrieved from GenBank. Residue 106 is indicated with an asterisk. (B) Structural
superimposition of the HCoV-OC43 N-NTD (green) with N-NTDs from SARS-CoV (magenta) (PDB ID: 2ofz),
IBV (yellow) (PDB ID: 2gec), and MHV (blue) (PDB ID: 3hd4). The inset depicts the location of R106.


3.3. RNA binding activity analyses of HCoV-OC43 WT and mutant N proteins


The T-coffee package was used to compare the N-NTD sequences of HCoV-OC43 (NP_937954), SARS-CoV
(ABI96968), HCoV-229E (AAG48597), IBV (AAB24054), and MHV (ACO72897). The results revealed that the
R106 residue is highly conserved among these coronaviruses (Fig. 3A). In addition, structural
alignment demonstrated that R106 is located within a positively charged region and protrudes from
the protein surface (Fig. 3B). Therefore, we predicted that R106 would play an indispensable role
in the interactions between the N protein and RNA. To explore the role of R106 in the interactions
between the HCoV-OC43 N protein and RNA, the R106 residue in the full-length N protein was replaced
by amino acids with various characteristics, i.e., K, E, M, Q, and A, via site-directed
mutagenesis. SPR analyses were then performed to measure the binding affinities between RNA and the
HCoV-OC43 N proteins (WT and five mutants). A stretch of single-stranded RNA,
5’-bio(UCUAAAC)4-3’, was immobilized manually on the surface of the SPR chip. This 28-mer RNA,
located in the 5’-end non-translated core sequence, was shown to be very important for coronavirus
replication and binding to the N protein [59]. Traces from the SPR experiments (Fig. 4A) show the
binding capacity of RNA to the WT and mutant N proteins. Comparison of the binding capacities of
these proteins (Fig. 4B) showed the binding capacity decreasing in the following order:
WT ~ R106K > R106Q ~ R106E > R106M > R106A. These results emphasize the importance of the
positively charged R106 in the interaction between the N protein and RNA. The kinetic association
and dissociation rate constants, ka and kd, respectively, and the equilibrium dissociation
constant, KD (kd/ka), were obtained from analyses of the SPR sensorgrams, and the results are listed
in Table 2. As shown in Fig. 5A, the ka values followed the order
WT ~ R106K > R106Q ~ R106E > R106M > R106A, with the ka of R106A smaller than that of the WT by
47.3%. On the other hand, the kd values followed the order
R106A ~ R106M > R106E ~ R106Q > R106K > WT (Fig. 5B), with the kd of R106A being 2.5 times larger
than that of WT. Taken together, the KD values for the HCoV-OC43 WT and five mutant N proteins
decreased in the order R106A > R106M > R106E > R106Q > R106K ~ WT (Table 2). The interaction
between R106 of the HCoV-OC43 N-NTD and RNA was modeled. The results of the docking simulation, as
shown in Fig. 6, showed that the interaction between R106 and RNA is due to the formation of two
hydrogen bonds between the NηH group of R106 and the OH groups on the C2’ and C3’ of the furanose
ring in RNA. The Nε of R106 also forms a hydrogen bond with the 5’-phosphate group of an adenine
residue in RNA.

Fig. 4. (A) SPR traces from an RNA binding affinity assay assessing WT HCoV-OC43 N protein and five
mutant N proteins. (B) Binding capacity of the WT and mutant N proteins at 0.1 µM for RNA.

To identify the role in RNA binding of R63 in the HCoV-229E N protein, which corresponds to R106 in
HCoV-OC43, the R63 residue in the HCoV-229E N protein was replaced by valine via site-directed
mutagenesis. SPR analyses were then performed to measure the binding affinities between RNA and the
HCoV-229E N proteins (WT and R63V). The ka values were 4.03 × 10⁴ and 3.30 × 10⁴ M⁻¹ s⁻¹ for WT
and R63V, respectively. Moreover, the kd values of WT and R63V were 5.87 × 10⁻⁴ and
8.33 × 10⁻⁴ s⁻¹, respectively. The KD value of the WT was ~2-fold lower than that of R63V. These
results suggested that R63 in the HCoV-229E N protein also plays an important role in RNA binding.

Fig. 5. Kinetic analyses of HCoV-OC43 WT and mutant N proteins binding to RNA: (A) ka and (B) kd.


Table 2
Numerical ka, kd, and KD values obtained from the kinetic analysis of the SPR experiments examining
binding of HCoV-OC43 WT and mutant N proteins to RNA.

          ka (M⁻¹ s⁻¹)/10⁴    kd (s⁻¹)/10⁻⁴    KD (nM)
  WT      5.01 ± 0.23         5.84 ± 0.31      11.7 ± 0.82
  R106A   2.64 ± 0.19         14.8 ± 0.61      56.1 ± 4.64
  R106E   3.74 ± 0.28         13.1 ± 0.48      35.0 ± 2.92
  R106K   5.37 ± 0.31         7.05 ± 0.32      13.1 ± 0.90
  R106M   3.43 ± 0.21         14.6 ± 0.51      42.6 ± 3.01
  R106Q   4.06 ± 0.20         11.5 ± 0.45      28.3 ± 1.78
  R107A   4.67 ± 0.24         8.81 ± 0.43      18.9 ± 1.33
  K110A   5.50 ± 0.22         6.29 ± 0.45      11.4 ± 0.93
  R117A   3.24 ± 0.18         5.63 ± 0.34      17.4 ± 1.42
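
The KD values in Table 2 follow from the tabulated rate constants through KD = kd/ka. The sketch below recomputes them as a consistency check. It assumes that the ka column is scaled by 10⁴ M⁻¹ s⁻¹ and the kd column by 10⁻⁴ s⁻¹, which is the scaling that reproduces the reported KD values; the names TABLE2, kd_nanomolar, and relative_to_wt are illustrative only.

```python
# (ka / 1e4 M^-1 s^-1, kd / 1e-4 s^-1, reported KD / nM), transcribed from Table 2.
TABLE2 = {
    "WT":    (5.01, 5.84, 11.7),
    "R106A": (2.64, 14.8, 56.1),
    "R106E": (3.74, 13.1, 35.0),
    "R106K": (5.37, 7.05, 13.1),
    "R106M": (3.43, 14.6, 42.6),
    "R106Q": (4.06, 11.5, 28.3),
    "R107A": (4.67, 8.81, 18.9),
    "K110A": (5.50, 6.29, 11.4),
    "R117A": (3.24, 5.63, 17.4),
}


def kd_nanomolar(ka_e4: float, kd_e4: float) -> float:
    """Equilibrium dissociation constant KD = kd/ka, returned in nM."""
    ka = ka_e4 * 1e4    # M^-1 s^-1
    kd = kd_e4 * 1e-4   # s^-1 (assumed scaling)
    return kd / ka * 1e9


def relative_to_wt(name: str, column: int) -> float:
    """Ratio of a mutant's ka (column 0) or kd (column 1) to the WT value."""
    return TABLE2[name][column] / TABLE2["WT"][column]


if __name__ == "__main__":
    for name, (ka_e4, kd_e4, reported) in TABLE2.items():
        print(f"{name:6s} KD = {kd_nanomolar(ka_e4, kd_e4):5.1f} nM (reported {reported})")
    print(f"R106A ka relative to WT: {relative_to_wt('R106A', 0):.3f}")
    print(f"R106A kd relative to WT: {relative_to_wt('R106A', 1):.2f}")
```

The R106A/WT ratio of about 0.53 for ka corresponds to the 47.3% decrease quoted in the text, and the ratio of about 2.5 for kd matches the stated 2.5-fold increase.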

To examine the effects on RNA binding of other conserved positively charged residues in the
positively charged loop of the N protein, including R107, K110, and R117, we created three
individual mutants, R107A, K110A, and R117A, of the HCoV-OC43 N protein via site-directed
mutagenesis and tested their RNA-binding affinity (Table 2). The ka values for RNA binding to R107A
and K110A are similar to that for WT; the R117A mutant had the smallest ka value (65% of WT). The
kd values for RNA binding to R117A and K110A are similar to that for WT; the R107A mutant had a
larger kd value (1.5-fold). The KD values of WT and K110A were essentially the same, while they
were increased in R117A and R107A, suggesting that R117 and R107 may participate in RNA binding of
the HCoV-OC43 N protein via electrostatic interactions.

Fig. 6. Predicted interactions between RNA and R106 in the HCoV-OC43 N-NTD. A single-stranded RNA,
5’-UCUAAAC-3’, was generated using the DS module Biopolymer; the subsequent docking calculations
were performed using HADDOCK 1.3 software. The Nε of R106 forms a hydrogen bond with the
5’-phosphate group of an adenine residue in RNA (cyan dashed lines). Additionally, the Nη of the
Arg106 guanidinium side chain forms hydrogen bonds with the ribose 2’-hydroxyl group and O3’ atom
of RNA (black dashed lines). Nitrogen and oxygen atoms are represented by blue and red balls,
respectively.

Fig. 7. Virus replication upon expression of wild-type and R106A N proteins in trans. Cells were
transfected with plasmid pcDNA3.1 encoding (A) the WT N protein or (B) the mutant (R106A) N protein
and then infected with HCoV-OC43 as described in the Materials and methods section. Samples were
then analyzed for levels of the matrix protein (MP) gene transcript. (C) No transfection and (D) no
infection controls are also shown. Quantitative data are shown as mean ± S.D., n = 3.


3.4. The effects of the R106A mutation on viral replication


Tan et al. previously demonstrated that R76 in IBV, which corresponds to R106 in HCoV-OC43, plays a
functional role in IBV infectivity [53]. To determine the effects of the R106A mutation on viral
replication, we monitored levels of viral RNA encoding the MP. Since all OC43 genes are transcribed
concordantly throughout HCoV-OC43 infection, levels of the MP gene should reflect virus
replication. For this purpose, 293T cells were transfected with plasmids encoding the mutant
(R106A) N protein or its WT counterpart, followed by infection with HCoV-OC43. As shown in Fig. 7,
in cells transfected with plasmids encoding the WT N protein and infected with virus, RNA levels of
MP were 7-fold higher than those detected in cells transfected with plasmids encoding the mutant
N protein (R106A), suggesting that expression of WT N protein in trans could stimulate viral
replication and that this stimulatory effect was markedly impaired by the R106A mutation. The
enhancement of viral replication by expression of CoV N protein in trans has also been reported
previously [60]. These results support the notion that R106 plays a critical role in virus
replication.


4. Conclusion 


Coronaviruses (CoVs) cause a wide spectrum of upper and lower respiratory tract infections
affecting humans and animals. The OC43 strain, identified in the 1960s [1,2], and SARS-CoV,
identified in 2003 [61], are among the CoVs known to be potential worldwide health risks for
humans. In CoVs, the N protein plays an important role in the packaging of the RNA genome into a
helical RNP and in viral RNA synthesis, including replication and transcription. In the present
study, we report the crystal structure of the HCoV-OC43 N-NTD, which includes residues 58 to 195.
The structure of the HCoV-OC43 N-NTD consists of a single-domain, five-stranded β-sheet structure
with a long disordered loop protruding outward, similar to those of the N-NTDs from other
coronaviruses. Similar to what was found for IBV [51], we demonstrated, using site-directed
mutagenesis, docking simulation, SPR, and a viral replication assay, that R106 of HCoV-OC43 plays a
crucial role in RNA binding and virus replication. Therefore, this study could facilitate the
development of drugs that disrupt the interaction between the N proteins of CoVs and RNA.